AITopics | approximate model

Multi-Step Generalized Policy Improvement by Leveraging Approximate Models

Alegre, Lucas N., Bazzan, Ana L. C., Nowé, Ann, da Silva, Bruno C.

Neural Information Processing Systems | Feb-15-2026

We introduce a principled method for performing zero-shot transfer in reinforcement learning (RL) by exploiting approximate models of the environment. Zero-shot transfer in RL has been investigated by leveraging methods rooted in generalized policy improvement (GPI) and successor features (SFs).
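As a rough illustration of the standard GPI rule with successor features that this line of work builds on (not the multi-step method the paper itself proposes), the sketch below picks, in a single state, the action that maximizes max_i ψ^{π_i}(s, a) · w over a set of previously learned policies. The function name `gpi_action`, the nested-list layout of `psis`, and the toy numbers are all assumptions made for illustration.

```python
def gpi_action(psis, w):
    """Return (action, value) chosen by generalized policy improvement.

    psis[i][a] is the successor-feature vector of policy i for action a
    in the current state (hypothetical layout); w is the reward weight
    vector of the new task, so the action-value is the dot product psi . w.
    """
    best_action, best_value = 0, float("-inf")
    for psi_i in psis:                      # loop over known policies
        for a, psi in enumerate(psi_i):     # loop over actions
            q = sum(p * wi for p, wi in zip(psi, w))
            if q > best_value:              # keep the max over policies and actions
                best_action, best_value = a, q
    return best_action, best_value


# Toy data: two policies, two actions, two-dimensional features.
psis = [
    [[1.0, 0.0], [0.5, 0.5]],   # successor features of policy 0
    [[0.0, 1.0], [0.25, 1.0]],  # successor features of policy 1
]
print(gpi_action(psis, [0.0, 1.0]))
print(gpi_action(psis, [1.0, 1.0]))
```

Because the reward weights `w` enter only through a dot product, a new task can be handled without further learning, which is the sense in which the transfer is zero-shot.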

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
South America > Brazil > Rio Grande do Sul (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(8 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

a18aa23ee676d7f5ffb34cf16df3e08c-Supplemental.pdf

Neural Information Processing Systems | Feb-9-2026, 15:05:45 GMT

algorithm, relation hold, value update, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.67)

a18aa23ee676d7f5ffb34cf16df3e08c-Paper.pdf

Neural Information Processing Systems | Feb-9-2026, 15:05:37 GMT

Real Time Dynamic Programming (RTDP) is an online algorithm based on Dynamic Programming (DP) that acts by 1-step greedy planning.
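A minimal sketch of one RTDP trial with 1-step greedy planning may make the idea concrete. It assumes a cost-minimizing, undiscounted shortest-path setting with a known model; the function `rtdp_trial`, the `model` layout, and the four-state chain are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def rtdp_trial(model, values, start, goal, max_steps=100, rng=random):
    """Run one RTDP trial, updating `values` in place.

    model[s][a] is a list of (probability, next_state, cost) outcomes
    (hypothetical layout). At each visited state the agent does a
    1-step lookahead, backs up the greedy value, and then acts greedily.
    """
    s = start
    for _ in range(max_steps):
        if s == goal:
            break
        # 1-step greedy lookahead using the current value estimates.
        q = {
            a: sum(p * (c + values[s2]) for p, s2, c in outcomes)
            for a, outcomes in model[s].items()
        }
        a = min(q, key=q.get)
        values[s] = q[a]  # value update only at the visited state
        probs = [p for p, _, _ in model[s][a]]
        nexts = [s2 for _, s2, _ in model[s][a]]
        s = rng.choices(nexts, weights=probs)[0]
    return values


# Toy chain 0 -> 1 -> 2 -> 3 (goal): "step" costs 1, "stay" costs 2.
GOAL = 3
model = {
    s: {"step": [(1.0, s + 1, 1.0)], "stay": [(1.0, s, 2.0)]}
    for s in range(GOAL)
}
values = [0.0] * (GOAL + 1)  # zero is an optimistic start for costs
for _ in range(5):
    rtdp_trial(model, values, start=0, goal=GOAL)
print(values)
```

Only states visited along a trajectory are updated, which is what distinguishes this online scheme from a full dynamic-programming sweep over the state space.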

artificial intelligence, machine learning, skt, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Israel (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

77c7faab15002432ba1151e8d5cc389a-Paper-Conference.pdf

Neural Information Processing Systems | Oct-8-2025, 22:40:20 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
South America > Brazil > Rio Grande do Sul (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(8 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

a18aa23ee676d7f5ffb34cf16df3e08c-Supplemental.pdf

Neural Information Processing Systems | Aug-15-2025, 12:39:58 GMT

algorithm, relation hold, value update, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.67)

a18aa23ee676d7f5ffb34cf16df3e08c-Paper.pdf

Neural Information Processing Systems | Aug-15-2025, 12:39:51 GMT

algorithm, h-lookahead policy, rtdp, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.68)

FAMES: Fast Approximate Multiplier Substitution for Mixed-Precision Quantized DNNs--Down to 2 Bits!

Ren, Yi, Xu, Ruge, Guo, Xinfei, Qian, Weikang

arXiv.org Artificial Intelligence | Dec-6-2024

A widely used technique in designing energy-efficient deep neural network (DNN) accelerators is quantization. Recent progress in this direction has reduced the bitwidths used in DNNs down to 2. Meanwhile, many prior works apply approximate multipliers (AppMuls) in designing DNN accelerators to lower their energy consumption. Unfortunately, these works still assume a bitwidth much larger than 2, which falls far behind the state of the art in the quantization area and even challenges the meaningfulness of applying AppMuls in DNN accelerators, since a high-bitwidth AppMul consumes much more energy than a low-bitwidth exact multiplier! Thus, an important problem to study is: Can approximate multipliers be effectively applied to quantized DNN models with very low bitwidths? In this work, we give an affirmative answer to this question and present a systematic solution that achieves the answer: FAMES, a fast approximate multiplier substitution method for mixed-precision DNNs. Our experiments demonstrate an average 28.67% energy reduction on state-of-the-art mixed-precision quantized models with bitwidths as low as 2 bits and accuracy losses kept under 1%. Additionally, our approach is up to 300x faster than previous genetic algorithm-based methods.

appmul, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2411.18055

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe (0.04)
North America > United States > Michigan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Exploring DNN Robustness Against Adversarial Attacks Using Approximate Multipliers

Askarizadeh, Mohammad Javad, Farahmand, Ebrahim, Castro-Godinez, Jorge, Mahani, Ali, Cabrera-Quiros, Laura, Salazar-Garcia, Carlos

arXiv.org Artificial Intelligence | Apr-17-2024

Deep Neural Networks (DNNs) have advanced in many real-world applications, such as healthcare and autonomous driving. However, their high computational complexity and vulnerability to adversarial attacks are ongoing challenges. In this letter, approximate multipliers are used to explore DNN robustness improvement against adversarial attacks. By uniformly replacing accurate multipliers with state-of-the-art approximate ones in DNN layer models, we explore the DNNs' robustness against various adversarial attacks in a feasible time. Results show an accuracy drop of up to 7% due to approximations when no attack is present, while robust accuracy improves by up to 10% when attacks are applied.

accuracy, adversarial attack, multiplier, (15 more...)

arXiv.org Artificial Intelligence

2404.11665

Country:

North America > Costa Rica > Cartago Province > Cartago (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Iran > Kerman Province > Kerman (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Model approximation in MDPs with unbounded per-step cost

Bozkurt, Berk, Mahajan, Aditya, Nayyar, Ashutosh, Ouyang, Yi

arXiv.org Artificial Intelligence | Feb-13-2024

We consider the problem of designing a control policy for an infinite-horizon discounted cost Markov decision process $\mathcal{M}$ when we only have access to an approximate model $\hat{\mathcal{M}}$. How well does an optimal policy $\hat{\pi}^{\star}$ of the approximate model perform when used in the original model $\mathcal{M}$? We answer this question by bounding a weighted norm of the difference between the value function of $\hat{\pi}^{\star}$ when used in $\mathcal{M}$ and the optimal value function of $\mathcal{M}$. We then extend our results and obtain potentially tighter upper bounds by considering affine transformations of the per-step cost. We further provide upper bounds that explicitly depend on the weighted distance between cost functions and weighted distance between transition kernels of the original and approximate models. We present examples to illustrate our results.
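The quantity being bounded can be computed directly on a toy problem. The sketch below plans in an approximate model, evaluates the resulting policy in the true model, and compares it with the true optimal value. It assumes a finite two-state MDP with bounded costs, which is far simpler than the unbounded-cost setting the paper addresses; the helper names, the transition matrices, and the costs are invented for illustration.

```python
def q_value(P, c, gamma, V, s, a):
    """One-step cost-to-go of action a in state s under value estimate V."""
    return c[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(len(V)))


def value_iteration(P, c, gamma, iters=2000):
    """Return (optimal value, greedy policy) for a discounted-cost MDP."""
    n = len(c)
    V = [0.0] * n
    for _ in range(iters):
        V = [min(q_value(P, c, gamma, V, s, a) for a in range(len(c[s])))
             for s in range(n)]
    policy = [min(range(len(c[s])),
                  key=lambda a, s=s: q_value(P, c, gamma, V, s, a))
              for s in range(n)]
    return V, policy


def evaluate(P, c, gamma, policy, iters=2000):
    """Return the value of a fixed deterministic policy."""
    n = len(c)
    V = [0.0] * n
    for _ in range(iters):
        V = [q_value(P, c, gamma, V, s, policy[s]) for s in range(n)]
    return V


gamma = 0.9
cost = [[1.0, 2.0], [0.0, 0.5]]                            # cost[s][a]
P_true = [[[0.9, 0.1], [0.2, 0.8]], [[0.1, 0.9], [0.5, 0.5]]]
P_hat = [[[0.5, 0.5], [0.2, 0.8]], [[0.1, 0.9], [0.5, 0.5]]]  # approximate model

V_star, _ = value_iteration(P_true, cost, gamma)           # optimal in the true model
_, policy_hat = value_iteration(P_hat, cost, gamma)        # planned in the approximation
V_hat_in_true = evaluate(P_true, cost, gamma, policy_hat)  # deployed in the true model
gap = [vh - vs for vh, vs in zip(V_hat_in_true, V_star)]
print(policy_hat, gap)
```

The per-state `gap` is non-negative by definition of optimality; results of the kind described in the abstract bound a weighted norm of this difference in terms of how far the approximate model is from the original one.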

approximation, assumption 1, denote, (16 more...)

arXiv.org Artificial Intelligence

2402.08813

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
(6 more...)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Bayesian score calibration for approximate models

Bon, Joshua J, Warne, David J, Nott, David J, Drovandi, Christopher

arXiv.org Machine Learning | Oct-27-2023

Scientists continue to develop increasingly complex mechanistic models to reflect their knowledge more realistically. Statistical inference using these models can be challenging since the corresponding likelihood function is often intractable and model simulation may be computationally burdensome. Fortunately, in many of these situations, it is possible to adopt a surrogate model or approximate likelihood function. It may be convenient to conduct Bayesian inference directly with the surrogate, but this can result in bias and poor uncertainty quantification. In this paper we propose a new method for adjusting approximate posterior samples to reduce bias and produce more accurate uncertainty quantification. We do this by optimizing a transform of the approximate posterior that maximizes a scoring rule. Our approach requires only a (fixed) small number of complex model simulations and is numerically stable. We demonstrate good performance of the new method on several examples of increasing complexity.

adjust-post, approximate posterior, posterior, (16 more...)

arXiv.org Machine Learning

2211.05357

Country: